AN OPTIMAL SUBGRADIENT ALGORITHM FOR LARGE-SCALE
CONVEX OPTIMIZATION IN SIMPLE DOMAINS 

MASOUD AHOOKHOSH* AND ARNOLD NEUMAIER+ 

Abstract. This paper shows that the optimal subgradient algorithm, OSGA, proposed in [59]
can be used for solving structured large-scale convex constrained optimization problems. Only
first-order information is required, and the optimal complexity bounds for both smooth and
nonsmooth problems are attained. More specifically, we consider two classes of problems: (i) a
convex objective with a simple closed convex domain, where the orthogonal projection onto this
feasible domain is efficiently available; (ii) a convex objective with a simple convex functional
constraint. If we equip OSGA with an appropriate prox-function, the OSGA subproblem can be
solved either in closed form or by a simple iterative scheme, which is especially important for
large-scale problems. We report numerical results for some applications to show the efficiency
of the proposed scheme. A software package implementing OSGA for the above domains is available.
1. Introduction. Convex optimization has been shown to provide efficient algorithms
for computing reliable solutions in a broad range of applications. Many applications
arising in applied sciences and engineering, such as signal and image processing, machine
learning, statistics, and general inverse problems, can be addressed by a convex optimization
problem involving high-dimensional data. In practice, solving a nonsmooth convex problem is
usually more difficult and costly than solving a smooth one. More precisely, for a prescribed
accuracy parameter $\varepsilon$, the optimal complexity to achieve an $\varepsilon$-solution
of nonsmooth Lipschitz continuous problems is $O(\varepsilon^{-2})$, whereas the superior
complexity $O(\varepsilon^{-1/2})$ holds for smooth problems with Lipschitz continuous
gradient; see [52, 53].

Thanks to their low memory requirement and simple structure, first-order methods
have received much attention during the past few decades. Indeed, they deal
successfully with large-scale problems. In general, convex optimization problems can
be solved by gradient-type algorithms [3, 21, 22, 38], conjugate gradient methods
[41, 45, 46], and spectral gradient methods [12, 23, 63] for smooth objectives, and by
subgradient-type methods [27, 51, 57], proximal gradient methods [62, 32], smoothing
techniques [15, 24, 34, 55], bundle-type algorithms [48, 49], and primal-dual first-order
methods [25, 26, 28] for nonsmooth objectives. Moreover, both classes can be
addressed by (zero-order) coordinate descent methods and derivative-free methods.
The current paper only addresses first-order methods and assumes that first-order
black-box information (function values and subgradients) of the objective function
is available.

Historically, gradient descent and subgradient methods were the first numerical
schemes proposed to solve optimization problems with smooth and nonsmooth convex
objective functions, respectively. In practice, they are too slow, especially for badly
scaled problems. This is reflected in their worst-case complexity bounds for reaching
an $\varepsilon$-solution: the gradient descent method attains a complexity of order
$O(\varepsilon^{-1})$, which is not optimal for smooth problems, while subgradient methods
attain the worst-case complexity of order $O(\varepsilon^{-2})$. In 1983, Nemirovski & Yudin
in [52] derived optimal worst-case complexity bounds of first-order methods to achieve
an $\varepsilon$-solution for several classes of problems, such as Lipschitz continuous
nonsmooth problems and smooth problems with Lipschitz continuous gradient. If an algorithm
attains the optimal worst-case complexity bound for a class of problems, it is called
optimal. Optimal first-order methods date back to Nesterov [54] in 1983. This
optimal first-order method is interesting both theoretically and computationally, and it
attracted many researchers to work on the development of such schemes, for example
Auslander & Teboulle [9], Beck & Teboulle [16], Devolder et al. [33], Gonzaga
et al. [39, 40], Lan [49], Lan et al. [50], Nesterov [55, 56, 58], Neumaier
[59], and Tseng [65]. Computational comparisons for composite functions show that
optimal Nesterov-type first-order methods are substantially superior to the gradient
descent and subgradient methods; see, for example, Ahookhosh [1] and Becker et
al. [18].

Content. In this paper we consider structured convex constrained optimization
problems frequently observed in applications and develop OSGA to solve such problems
efficiently. Two classes of convex domains are considered, namely, simple convex
domains for which the orthogonal projection is cheaply computable, and sublevel sets of a
convex function, referred to as functional domains. For problems with a simple domain,
we first introduce an appropriate prox-function and then show that the solution of
OSGA’s subproblem is obtained by a projection onto the domain followed by solving a
one-dimensional nonlinear equation. It is shown that if an explicit formula for the projection
is available, the nonlinear equation can be solved in closed form in many interesting
cases. We also establish the optimality conditions for functional domains and show for
some simple functions that they result in a closed-form solution. Finally, we report
some numerical results for applications to show the efficiency of OSGA in comparison
with some state-of-the-art algorithms.

The remainder of this paper is organized as follows. In the next section, we review
the basic idea of OSGA. Section 3 considers structured convex constrained minimization
over simple domains and shows how to solve the associated OSGA subproblem, and Section 4
treats problems with a functional constraint. We report numerical results in Section 5,
followed by our conclusions.

Notation and preliminaries. Let $V$ be a real finite-dimensional vector space
endowed with the norm $\|\cdot\|$, and let $V^*$ denote the dual space of all linear functionals
on $V$, where the bilinear pairing $\langle g, x\rangle$ denotes the value of the functional
$g \in V^*$ at $x \in V$. If $V = \mathbb{R}^n$, then

\[ \|x\|_2 = \Big( \sum_{i=1}^n |x_i|^2 \Big)^{1/2}. \]

If $x \in \mathbb{R}^{m\times n}$, then the Schatten $\infty$-norm is $\|\sigma(x)\|_\infty$,
where $\sigma : \mathbb{R}^{m\times n} \to \mathbb{R}^{\min\{m,n\}}$ is the function that takes
a matrix $x \in \mathbb{R}^{m\times n}$ and returns the vector of its singular values
in nonincreasing order. If $x$ is a positive semidefinite matrix, we denote this by
$x \succeq 0$. We also denote by $x = \sum_{i=1}^n \lambda_i u_i u_i^T$ and
$x = \sum_{i=1}^n \sigma_i u_i v_i^T$ the eigenvalue decomposition and the singular value
decomposition of $x$, respectively. For a function
$f : V \to \overline{\mathbb{R}} = \mathbb{R} \cup \{\pm\infty\}$, we denote by

\[ \operatorname{dom} f := \{ x \in V \mid f(x) < +\infty \} \]

its effective domain and call $f$ proper if $\operatorname{dom} f \neq \emptyset$ and
$f(x) > -\infty$ for all $x \in V$. The vector $g \in V^*$ is called a subgradient of $f$
at $x$ if $f(x) \in \mathbb{R}$ and

\[ f(y) \ge f(x) + \langle g, y - x \rangle \quad \text{for all } y \in V. \]

The set $\partial f(x)$ of all subgradients is called the subdifferential of $f$ at $x$.

We call a nonempty, closed, and convex subset $C$ of $V$ a simple convex domain
if the orthogonal projection

\[ P_C(y) := \operatorname*{argmin}_{x \in C} \tfrac{1}{2} \|x - y\|_2^2 \tag{1.1} \]

of $y$ onto $C$ can be found efficiently for every $y \in V$. Note that $P_C(y)$ is unique since
$\tfrac{1}{2}\|x - y\|_2^2$ is strongly convex. Computing the orthogonal projection is a
well-studied topic in convex optimization, and the projection operator is available for many
domains $C$ either in closed form or by a simple iterative scheme. Table 1.1 gives some
practically interesting convex domains, associated projection operators, and references
for the formulas or iterative schemes.


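Table 1.1: List of some available projection operators for $C = \{x \in V \mid c(x)\}$

| Defining constraint $c(x)$ | Projection operator $u = P_C(y)$ | Ref. |
|---|---|---|
| $Ax = b$ | $u = y - A^\dagger (Ay - b)$ | [62] |
| $\langle a, x\rangle = b$ | $u = y - \big((\langle a, y\rangle - b)/\|a\|_2^2\big)\, a$ | [13] |
| $\langle a, x\rangle \le b$ | $u = y - \big((\langle a, y\rangle - b)_+/\|a\|_2^2\big)\, a$ | [13] |
| $|\langle a, x\rangle| \le b$ | $u = y$ if $|\langle a, y\rangle| \le b$; $u = y + \big((b - \langle a, y\rangle)/\|a\|_2^2\big)\, a$ if $\langle a, y\rangle > b$; $u = y + \big((-b - \langle a, y\rangle)/\|a\|_2^2\big)\, a$ if $\langle a, y\rangle < -b$ | [13, 14] |
| $\underline{b} \le Ax \le \overline{b}$ | $u = y - \sum_{i=1}^m \big(\lambda_i(y)/\|A_{i:}\|_2^2\big)\, A_{i:}$, where $\lambda_i(x) := 0$ if $\underline{b}_i \le \langle A_{i:}, x\rangle \le \overline{b}_i$; $\lambda_i(x) := \langle A_{i:}, x\rangle - \overline{b}_i$ if $\langle A_{i:}, x\rangle > \overline{b}_i$; $\lambda_i(x) := \langle A_{i:}, x\rangle - \underline{b}_i$ if $\underline{b}_i > \langle A_{i:}, x\rangle$ | [13] |
| $x \in [\underline{x}, \overline{x}]$ | $u = \sup\{\underline{x}, \inf\{y, \overline{x}\}\}$ | [13] |
| $x \ge 0$ | $u = (y)_+ := \max(y, 0)$ | [62] |
| $\|x\|_1 \le \xi$ | iterative scheme | [36, 62] |
| $\|x\|_2 \le \xi$ | $u = \xi y/\|y\|_2$ if $\|y\|_2 > \xi$; $u = y$ if $\|y\|_2 \le \xi$ | [13] |
| $\|x\|_\infty \le \xi$ | $u = \sup\{-\xi \mathbf{1}, \inf\{y, \xi \mathbf{1}\}\}$ | [62] |
| $\{(x, t) \mid \|x\|_2 \le t\}$ | $u = 0$ if $\|y\|_2 \le -t$; $u = (y, t)$ if $\|y\|_2 \le t$; $u = \tfrac{1}{2}\big(1 + t/\|y\|_2\big)(y, \|y\|_2)$ if $\|y\|_2 > |t|$ | [13] |
| Exponential cone | iterative scheme | [62] |
| Epigraphs | iterative scheme | [13] |
| Sublevel sets | iterative scheme | [13] |
| Simplex | iterative scheme | [62] |
| $x \succeq 0$ | $u = \sum_{i=1}^n \max(\lambda_i, 0)\, u_i u_i^T$, where $y = \sum_{i=1}^n \lambda_i u_i u_i^T$ | [62] |
| $x \succeq 0$, $\operatorname{tr}(x) = 1$ | iterative scheme | [62] |
| $\|\sigma(x)\|_\infty \le 1$ | $u = \sum_{i=1}^n \min(\sigma_i, 1)\, u_i v_i^T$, where $y = \sum_{i=1}^n \sigma_i u_i v_i^T$ | [62] |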
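To make a few closed-form entries of the projection table concrete, the following is a minimal Python sketch, assuming vectors are stored as plain lists of floats. The function names (`project_box`, `project_nonneg`, `project_halfspace`, `project_ball`) are illustrative choices and are not taken from the OSGA package.

```python
import math

def project_box(y, lower, upper):
    # Componentwise sup{lower, inf{y, upper}} for the box [lower, upper].
    return [min(max(yi, lo), up) for yi, lo, up in zip(y, lower, upper)]

def project_nonneg(y):
    # (y)_+ = max(y, 0) for the nonnegative orthant.
    return [max(yi, 0.0) for yi in y]

def project_halfspace(y, a, b):
    # C = {x | <a, x> <= b}: subtract the positive part of the violation along a.
    violation = sum(ai * yi for ai, yi in zip(a, y)) - b
    if violation <= 0.0:
        return list(y)
    step = violation / sum(ai * ai for ai in a)
    return [yi - step * ai for yi, ai in zip(y, a)]

def project_ball(y, xi):
    # C = {x | ||x||_2 <= xi}: rescale y onto the sphere if it lies outside.
    norm = math.sqrt(sum(yi * yi for yi in y))
    if norm <= xi:
        return list(y)
    return [xi * yi / norm for yi in y]
```

Each function costs a number of operations linear in the dimension, which is the sense in which such domains are called simple.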
2. A review of OSGA. In what follows we briefly review the main idea of the
optimal subgradient algorithm proposed by Neumaier in [59]. To this end, we first
consider the convex constrained minimization problem

\[ \min\ f(x) \quad \text{s.t.}\ x \in C, \tag{2.1} \]

where $f : C \to \mathbb{R}$ is a convex function defined on a nonempty, closed, and convex
subset $C$ of $V$. The main objective is to find a solution $u \in C$ by using first-order
information, i.e., function values and subgradients.

OSGA (see Algorithm 1) is an optimal subgradient algorithm for problem (2.1)
that constructs a sequence of iterates whose function values converge to
the minimum with the optimal complexity. Moreover, OSGA requires no information
regarding global parameters such as Lipschitz constants of function values and gradients.
The primary objective is to monotonically reduce bounds on the error $f(x_b) - \hat{f}$
of function values, where $\hat{f}$ is the minimum and $x_b$ is the best known point.

OSGA considers the linear relaxations

\[ f(z) \ge \gamma + \langle h, z \rangle \quad \text{for all } z \in C, \tag{2.2} \]

of $f$, where $\gamma \in \mathbb{R}$ and $h \in V^*$, and a continuously differentiable
prox-function $Q : C \to \mathbb{R}$ satisfying

\[ Q_0 := \inf_{z \in C} Q(z) > 0 \tag{2.3} \]

and

\[ Q(z) \ge Q(x) + \langle g_Q(x), z - x \rangle + \frac{\sigma}{2} \|z - x\|^2 \quad \text{for all } x, z \in C, \tag{2.4} \]

where $\sigma = 1$, $g_Q(x)$ denotes the gradient of $Q$ at $x \in C$, and $\|\cdot\|$ is a norm
defined on $V$. OSGA solves a sequence of maximization problems of the form

\[ \sup\ E_{\gamma,h}(x) \quad \text{s.t.}\ x \in C, \tag{2.5} \]

where it is known that the supremum is positive. The function $E_{\gamma,h} : C \to \mathbb{R}$
is defined by

\[ E_{\gamma,h}(x) := -\frac{\gamma + \langle h, x \rangle}{Q(x)}. \tag{2.6} \]

We denote by $u = U(\gamma, h) \in C$ a solution of this problem and by $e = E(\gamma, h)$ the
corresponding optimal value, and we assume that both are readily computable.
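As a small illustration of the subproblem objective, the sketch below evaluates $E_{\gamma,h}(x) = -(\gamma + \langle h, x\rangle)/Q(x)$ for the quadratic prox-function $Q(x) = \tfrac12\|x\|_2^2 + Q_0$ and computes the maximizer in the unconstrained case $C = V$. It assumes $h \neq 0$ and $Q_0 > 0$; the maximizer is taken as $u = -h/e$ with $e$ the larger root of $Q_0 e^2 + \gamma e - \tfrac12\|h\|_2^2 = 0$, which is the stationarity condition of $E_{\gamma,h}$. The function names are hypothetical.

```python
import math

def prox_q(x, q0):
    # Quadratic prox-function Q(x) = 0.5 * ||x||_2^2 + Q0.
    return 0.5 * sum(xi * xi for xi in x) + q0

def e_value(x, gamma, h, q0):
    # E_{gamma,h}(x) = -(gamma + <h, x>) / Q(x).
    return -(gamma + sum(hi * xi for hi, xi in zip(h, x))) / prox_q(x, q0)

def unconstrained_maximizer(gamma, h, q0):
    # Larger root of Q0*e^2 + gamma*e - 0.5*||h||^2 = 0, then u = -h/e.
    hh = sum(hi * hi for hi in h)
    e = (-gamma + math.sqrt(gamma * gamma + 2.0 * q0 * hh)) / (2.0 * q0)
    return [-hi / e for hi in h], e

u, e = unconstrained_maximizer(-1.0, [3.0, 4.0], 0.5)
print(e, e_value(u, -1.0, [3.0, 4.0], 0.5))  # the two numbers agree
```

At the maximizer the objective value coincides with $e$, so $e$ plays the role of both the multiplier in $u = -h/e$ and the optimal value of the subproblem.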

In [59], it is shown that OSGA attains the following bound on function values:

\[ 0 \le f(x_b) - \hat{f} \le \eta\, Q(\hat{x}), \]

where $\hat{x}$ denotes a minimizer of (2.1). Hence, by decreasing the error factor $\eta$,
convergence to an $\varepsilon$-minimizer $x_b$ is guaranteed by

\[ 0 \le f(x_b) - \hat{f} \le \varepsilon \]

for the accuracy tolerance $\varepsilon > 0$. In [59], it is shown that the number of iterations
needed to achieve this accuracy is of order $O(\varepsilon^{-1/2})$ for smooth $f$ with Lipschitz
continuous gradients and of order $O(\varepsilon^{-2})$ for Lipschitz continuous nonsmooth $f$,
which is optimal in both cases, cf. Nemirovski & Yudin [52] and Nesterov [53]. The
algorithm does not need to know the global Lipschitz parameters and has a
low memory requirement. Hence, if the subproblem (2.5) can be solved efficiently,
OSGA is appropriate for solving large-scale problems. Numerical results reported
by Ahookhosh in [1] and Ahookhosh & Neumaier in [4, 5] for unconstrained
problems, and by Ahookhosh & Neumaier in [6, 7] for constrained problems, show
the promising behavior of OSGA for practical problems. In the next section we show
that by selecting a suitable prox-function, OSGA’s subproblem (2.5) can be solved
efficiently for structured convex constrained problems.


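Algorithm 1: OSGA (optimal subgradient algorithm)

Input: $\delta$, $\alpha_{\max} \in\ ]0,1[$, $0 < \kappa' \le \kappa$; local parameters: $x_0$, $\mu \ge 0$, $f_{\rm target}$;
Output: $x_b$, $f_{x_b}$;
begin
    choose an initial best point $x_b$;
    compute $f_{x_b}$ and $g_{x_b}$;
    if $f_{x_b} \le f_{\rm target}$ then
        stop;
    else
        $h = g_{x_b} - \mu g_Q(x_b)$; $\gamma = f_{x_b} - \mu Q(x_b) - \langle h, x_b \rangle$;
        $\gamma_b = \gamma - f_{x_b}$; $u = U(\gamma_b, h)$; $\eta = E(\gamma_b, h) - \mu$;
    end
    $\alpha \leftarrow \alpha_{\max}$;
    while stopping criteria do not hold do
        $x = x_b + \alpha(u - x_b)$; compute $f_x$ and $g_x$;
        $g = g_x - \mu g_Q(x)$; $\bar{h} = h + \alpha(g - h)$;
        $\bar{\gamma} = \gamma + \alpha(f_x - \mu Q(x) - \langle g, x \rangle - \gamma)$;
        $x_b' = \operatorname{argmin}_{z \in \{x_b, x\}} f(z)$; $f_{x_b'} = \min\{f_{x_b}, f_x\}$;
        $\gamma_b' = \bar{\gamma} - f_{x_b'}$; $u' = U(\gamma_b', \bar{h})$;
        $x' = x_b + \alpha(u' - x_b)$; compute $f_{x'}$;
        choose $\bar{x}_b$ in such a way that $f_{\bar{x}_b} \le \min\{f_{x_b'}, f_{x'}\}$;
        $\bar{\gamma}_b = \bar{\gamma} - f_{\bar{x}_b}$; $\bar{u} = U(\bar{\gamma}_b, \bar{h})$; $\bar{\eta} = E(\bar{\gamma}_b, \bar{h}) - \mu$; $x_b = \bar{x}_b$; $f_{x_b} = f_{\bar{x}_b}$;
        if $f_{x_b} \le f_{\rm target}$ then
            stop;
        else
            update the parameters $\alpha$, $h$, $\gamma$, $\eta$, and $u$ using PUS;
        end
    end
end

As discussed in [59], OSGA uses the following scheme for updating the given
parameters $\alpha$, $h$, $\gamma$, $\eta$, and $u$: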
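Algorithm 2: PUS (parameters updating scheme)

Input: $\delta$, $\alpha_{\max} \in\ ]0,1[$, $0 < \kappa' \le \kappa$, $\alpha$, $\eta$, $\bar{h}$, $\bar{\gamma}$, $\bar{\eta}$, $\bar{u}$;
Output: $\alpha$, $h$, $\gamma$, $\eta$, $u$;
begin
    $R \leftarrow (\eta - \bar{\eta})/(\delta \alpha \eta)$;
    if $R < 1$ then
        $\bar{\alpha} \leftarrow \alpha e^{-\kappa}$;
    else
        $\bar{\alpha} \leftarrow \min(\alpha e^{\kappa'(R-1)}, \alpha_{\max})$;
    end
    $\alpha \leftarrow \bar{\alpha}$;
    if $\bar{\eta} < \eta$ then
        $h \leftarrow \bar{h}$; $\gamma \leftarrow \bar{\gamma}$; $\eta \leftarrow \bar{\eta}$; $u \leftarrow \bar{u}$;
    end
end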


3. Structured convex constrained problems in simple domains. In this
paper we consider the convex constrained optimization problem

\[ \min\ f(Ax) \quad \text{s.t.}\ x \in C, \tag{3.1} \]

where $f : C \to \mathbb{R}$ is convex and lower semicontinuous, $A$ is a linear
operator, and $C$ is a simple convex domain. We call problem (3.1) a simple domain
problem. This problem appears in many applications such as signal and image
processing, machine learning, statistics, and inverse problems.

Example 3.1. (Image restoration) The process of reconstructing or estimating
a true image from a degraded observation is known as image restoration,
also called deblurring or deconvolution. Image restoration is addressed by solving a
constraint satisfaction problem of the form

\[ Ax = b, \quad x \in C, \]

where $C$ is a convex domain, commonly a box or the nonnegative orthant.
This is an ill-posed problem, see Neumaier [60], and it is normally handled by the
regularized least-squares problem

\[ \min\ \tfrac{1}{2} \|Ax - b\|_2^2 + \lambda \varphi(x) \quad \text{s.t.}\ x \in C \tag{3.2} \]

or the regularized $\ell_1$ problem

\[ \min\ \|Ax - b\|_1 + \lambda \varphi(x) \quad \text{s.t.}\ x \in C, \tag{3.3} \]

where $\varphi : C \to \mathbb{R}$ is a convex regularization function such as $\|\cdot\|_2$,
$\|\cdot\|_1$, $\|\cdot\|_{ITV}$, and $\|\cdot\|_{ATV}$. The regularizers $\|\cdot\|_{ITV}$ and
$\|\cdot\|_{ATV}$ are respectively called isotropic and anisotropic total variation, see, for
example, [29], where they are defined by

\[ \|x\|_{ITV} = \sum_{i=1}^{m-1} \sum_{j=1}^{n-1} \sqrt{(x_{i+1,j} - x_{i,j})^2 + (x_{i,j+1} - x_{i,j})^2}
   + \sum_{i=1}^{m-1} |x_{i+1,n} - x_{i,n}| + \sum_{j=1}^{n-1} |x_{m,j+1} - x_{m,j}| \]

and

\[ \|x\|_{ATV} = \sum_{i=1}^{m-1} \sum_{j=1}^{n-1} \big\{ |x_{i+1,j} - x_{i,j}| + |x_{i,j+1} - x_{i,j}| \big\}
   + \sum_{i=1}^{m-1} |x_{i+1,n} - x_{i,n}| + \sum_{j=1}^{n-1} |x_{m,j+1} - x_{m,j}| \]

for $x \in \mathbb{R}^{m\times n}$.
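The two total variation regularizers can be evaluated directly from their definitions: an interior double sum over forward differences plus two boundary sums along the last column and the last row. The following is a minimal sketch, assuming the image is given as a list of rows of floats; the names `atv` and `itv` are illustrative.

```python
import math

def _boundary_terms(x):
    # Forward differences along the last column and along the last row.
    m, n = len(x), len(x[0])
    col = sum(abs(x[i + 1][n - 1] - x[i][n - 1]) for i in range(m - 1))
    row = sum(abs(x[m - 1][j + 1] - x[m - 1][j]) for j in range(n - 1))
    return col + row

def atv(x):
    # Anisotropic total variation: sum of absolute forward differences.
    m, n = len(x), len(x[0])
    interior = sum(abs(x[i + 1][j] - x[i][j]) + abs(x[i][j + 1] - x[i][j])
                   for i in range(m - 1) for j in range(n - 1))
    return interior + _boundary_terms(x)

def itv(x):
    # Isotropic total variation: Euclidean length of the forward-difference pair.
    m, n = len(x), len(x[0])
    interior = sum(math.hypot(x[i + 1][j] - x[i][j], x[i][j + 1] - x[i][j])
                   for i in range(m - 1) for j in range(n - 1))
    return interior + _boundary_terms(x)
```

For the image with rows (0, 3) and (4, 0), the interior term contributes 7 to the anisotropic and 5 to the isotropic variant, and both share the boundary contribution 7.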

Example 3.2. (Basis pursuit problem) Let $A : \mathbb{R}^n \to \mathbb{R}^m$ be a linear
operator with $m < n$ and $y \in \mathbb{R}^m$. The basis pursuit problem is the constrained
minimization problem

\[ \min\ \|x\|_1 \quad \text{s.t.}\ Ax = y, \tag{3.4} \]

which determines an $\ell_1$-minimal solution $x$ of the underdetermined linear system $Ax = y$.
This problem appears in many applications such as signal and image processing and
compressed sensing, see [19, 20, 31, 35, 67, 68, 69] and references therein.

According to the features of the objective functions, (3.2) can be solved by Nesterov-type
optimal methods, whereas (3.3) and (3.4) cannot. Since OSGA only needs first-order
information, it can deal with all of these problems without exploiting their structure.
In the remainder of this section, we establish how OSGA can be used to solve problem (3.1)
efficiently. Since the underlying problem (3.1) is a special case of the problem (2.1)
considered in [59], the complexity of OSGA remains valid for both smooth and nonsmooth
problems. The quadratic function

\[ Q(z) := \tfrac{1}{2} \|z\|_2^2 + Q_0 \tag{3.5} \]

is a prox-function, see e.g. [1]. We now show that the solution of OSGA’s subproblem
(2.5) can be found either in closed form or by a simple iterative scheme. In particular,
we address some convex domains for which a closed-form solution of the associated OSGA
subproblem (2.5) can be found.

The next result shows that the solution of the auxiliary subproblem (2.5) is given
by the orthogonal projection (1.1) of $y := -e^{-1}h$ onto the domain $C$, followed by solving
a one-dimensional nonlinear equation to determine $e$.

Theorem 3.3. Let $u$ be a solution of (2.5) and let $e = E_{\gamma,h}(u) > 0$. Then

\[ u = u(e) := P_C(y), \quad y := -e^{-1}h, \]

where $e$ is a solution of the univariate equation

\[ \varphi(e) = 0 \]

with

\[ \varphi(e) := e \Big( \tfrac{1}{2} \|u(e)\|_2^2 + Q_0 \Big) + \gamma + \langle h, u(e) \rangle. \tag{3.6} \]

Proof. From Proposition 5.1 in [59], at the solution $u$ we obtain

\[ e\,Q(u) = -\gamma - \langle h, u \rangle \tag{3.7} \]

and

\[ \langle e u + h, z - u \rangle \ge 0 \quad \text{for all } z \in C. \tag{3.8} \]

Since the left-hand side of this variational inequality vanishes for $z = u$, it follows that
$u$ is a solution of the minimization problem

\[ \inf_{z \in C}\ \langle e u + h, z - u \rangle. \]

The first-order optimality condition for this problem is

\[ 0 \in e u + h + N_C(u), \tag{3.9} \]

where

\[ N_C(u) := \{ p \in V \mid \langle p, u - z \rangle \ge 0 \ \text{for all } z \in C \} \]

denotes the normal cone to $C$ at $u$. Since $e > 0$, $u$ satisfies

\[ u = \operatorname*{argmin}_{z \in C} \tfrac{1}{2} \|e z + h\|_2^2
     = \operatorname*{argmin}_{z \in C} \tfrac{1}{2} \|z - y\|_2^2 = P_C(y) = u(e), \]

where $y = -e^{-1}h$, giving the result. □

Theorem 3.3 gives a way to compute a solution of OSGA’s subproblem (2.5)
involving a projection onto the domain $C$ and the solution of a one-dimensional nonlinear
equation. This equation can be solved exactly for some projection operators, see
Table 3.1. Otherwise, one can solve this nonlinear equation approximately using zero
finding schemes, see e.g. Chapter 5 of [61]. We apply the results of Theorem 3.3 in
the next scheme to solve OSGA’s subproblem (2.5):


Algorithm 3: OSS (OSGA’s subproblem solver)

Input: $Q_0$, $\gamma$, $h$; a program for evaluating $\varphi(e)$ defined in (3.6);
Output: $u$, $e$;
begin
    solve the nonlinear equation $\varphi(e) = 0$ either in closed form or
    approximately by a root finding solver;
    set $u = u(e)$.
end
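The subproblem solver can be sketched with plain bisection as the root finder. The following Python sketch assumes that a projection routine for $C$ is supplied as a function and that a bracket $[a, b]$ with $\varphi(a)\varphi(b) < 0$ is known; the names `phi`, `oss_bisection`, and `project_nonneg` are illustrative, and bisection stands in for the more sophisticated zero finders one would use in practice.

```python
import math

def project_nonneg(y):
    # Projection onto the nonnegative orthant, used here as an example domain.
    return [max(yi, 0.0) for yi in y]

def phi(e, gamma, h, q0, project):
    # phi(e) = e * (0.5 * ||u(e)||^2 + Q0) + gamma + <h, u(e)>, u(e) = P_C(-h/e).
    u = project([-hi / e for hi in h])
    value = (e * (0.5 * sum(ui * ui for ui in u) + q0) + gamma
             + sum(hi * ui for hi, ui in zip(h, u)))
    return value, u

def oss_bisection(gamma, h, q0, project, a, b, tol=1e-12, max_iter=200):
    # Bisection on [a, b]; requires a sign change of phi over the bracket.
    fa = phi(a, gamma, h, q0, project)[0]
    fb = phi(b, gamma, h, q0, project)[0]
    if fa * fb > 0.0:
        raise ValueError("phi(a) and phi(b) must have opposite signs")
    for _ in range(max_iter):
        mid = 0.5 * (a + b)
        fm = phi(mid, gamma, h, q0, project)[0]
        if fa * fm <= 0.0:
            b = mid            # root lies in [a, mid]
        else:
            a, fa = mid, fm    # root lies in [mid, b]
        if b - a <= tol:
            break
    e = 0.5 * (a + b)
    return phi(e, gamma, h, q0, project)[1], e

u, e = oss_bisection(-1.0, [-3.0, 4.0], 0.5, project_nonneg, 1e-6, 100.0)
```

For this example data the equation reduces to $\tfrac12 e^2 - e - \tfrac92 = 0$, so the computed $e$ can be compared with the exact root $1 + \sqrt{10}$.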


To implement Algorithm 3 (OSS), we first need to solve the projection problem
(1.1) effectively, see Table 1.1. If one solves the equation $\varphi(e) = 0$ approximately, and
an initial interval $[a, b]$ is available such that $\varphi(a)\varphi(b) < 0$, then a solution can be
computed to $\varepsilon$-accuracy using the bisection scheme in $O(\log_2((b - a)/\varepsilon))$
iterations, see, for example, [61]. However, it is preferable to use a more sophisticated zero
finder like the secant bisection scheme (Algorithm 5.2.6, [61]). If an interval $[a, b]$
with sign change is available$^1$, one can also use MATLAB’s fzero function, which combines
the bisection scheme, inverse quadratic interpolation, and the secant method.

$^1$ Without a sign change, fzero is unreliable; it fails on the simple quadratic
$x^2 - 0.0001 = 0$ with starting point 0.2.

In the following we investigate special domains $C$ for which the nonlinear equation
$\varphi(e) = 0$ can be solved explicitly, see Table 3.1.

Table 3.1: List of domains $C$ where $\varphi(e) = 0$ can be solved explicitly

| Defining constraint $c(x)$ | Solution |
|---|---|
| $Ax = b$ | Proposition 3.1 |
| $\langle a, x\rangle = b$ | Corollary 3.2 |
| $\langle a, x\rangle \le b$ | Proposition 3.3 |
| $x \ge 0$ | Proposition 3.4 |
| $\|x\|_2 \le \xi$ | Proposition 3.5 |

Proposition 3.1. If $C = \{x \in V \mid Ax = b\}$ is an affine set, then the subproblem
(2.5) is solved by $u = P_C(-e^{-1}h)$, where

\[ P_C(y) = y - A^\dagger (Ay - b) \tag{3.10} \]

and

\[ e = \frac{-\beta_2 + \sqrt{\beta_2^2 - 4\beta_1\beta_3}}{2\beta_1} \tag{3.11} \]

with

\[ \beta_1 := \tfrac{1}{2} \|A^\dagger b\|_2^2 + Q_0, \quad
   \beta_2 := \langle A^\dagger (Ah), A^\dagger b \rangle + \gamma, \quad
   \beta_3 := \tfrac{1}{2} \|A^\dagger (Ah)\|_2^2 - \tfrac{1}{2} \|h\|_2^2. \tag{3.12} \]


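Proof. The projection operator onto $C$ is given by (3.10). This and $y = -e^{-1}h$ give

\[ P_C(-e^{-1}h) = e^{-1} \big( A^\dagger (Ah + e b) - h \big). \]

This, together with (3.7) multiplied by $e$, yields

\[
\begin{aligned}
e \big( e\,Q(u) + \gamma + \langle h, u \rangle \big)
 &= e^2 \Big( \tfrac{1}{2} \|P_C(-e^{-1}h)\|_2^2 + Q_0 \Big) + \gamma e + e \langle h, P_C(-e^{-1}h) \rangle \\
 &= \tfrac{1}{2} \|A^\dagger (Ah + e b)\|_2^2 + \tfrac{1}{2} \|h\|_2^2 - \langle A^\dagger (Ah + e b), h \rangle + Q_0 e^2
    + \gamma e + \langle A^\dagger (Ah + e b) - h, h \rangle \\
 &= \Big( \tfrac{1}{2} \|A^\dagger b\|_2^2 + Q_0 \Big) e^2 + \big( \langle A^\dagger (Ah), A^\dagger b \rangle + \gamma \big) e
    + \tfrac{1}{2} \|A^\dagger (Ah)\|_2^2 - \tfrac{1}{2} \|h\|_2^2 \\
 &= \beta_1 e^2 + \beta_2 e + \beta_3 = 0,
\end{aligned}
\]

where $\beta_1$, $\beta_2$, and $\beta_3$ are defined in (3.12). Since the subproblem (2.5) is a
maximization problem, the bigger root of this equation is selected, which is given by (3.11). □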
Corollary 3.2. If $C = \{x \in V \mid a^T x = b\}$ is a hyperplane, then the subproblem
(2.5) is solved by $u = P_C(-e^{-1}h)$, where

\[ P_C(y) = y - \Big( \frac{\langle a, y \rangle - b}{\|a\|_2^2} \Big) a, \tag{3.13} \]

and $e$ is given by (3.11) with

\[ \beta_1 := \frac{b^2}{2\|a\|_2^2} + Q_0, \quad
   \beta_2 := \frac{b \langle a, h \rangle}{\|a\|_2^2} + \gamma, \quad
   \beta_3 := \frac{1}{2} \frac{\langle a, h \rangle^2}{\|a\|_2^2} - \frac{1}{2} \|h\|_2^2. \tag{3.14} \]

Proof. Since the hyperplane $C = \{x \in V \mid a^T x = b\}$ is an affine set, this is a
special case of Proposition 3.1 with $A = a^T$ and $A^\dagger = a/\|a\|_2^2$. □
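The hyperplane case can be checked numerically. The sketch below assumes the coefficients obtained by specializing the affine case to a single row, namely $\beta_1 = b^2/(2\|a\|_2^2) + Q_0$, $\beta_2 = b\langle a,h\rangle/\|a\|_2^2 + \gamma$, and $\beta_3 = \langle a,h\rangle^2/(2\|a\|_2^2) - \tfrac12\|h\|_2^2$, and then verifies that the resulting point is feasible and that the objective value equals $e$. The function names are illustrative.

```python
import math

def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def solve_hyperplane(a, b, gamma, h, q0):
    # Closed-form solution of the subproblem over C = {x | <a, x> = b}.
    aa, ah, hh = dot(a, a), dot(a, h), dot(h, h)
    b1 = 0.5 * b * b / aa + q0
    b2 = b * ah / aa + gamma
    b3 = 0.5 * ah * ah / aa - 0.5 * hh
    e = (-b2 + math.sqrt(b2 * b2 - 4.0 * b1 * b3)) / (2.0 * b1)  # bigger root
    y = [-hi / e for hi in h]
    shift = (dot(a, y) - b) / aa
    u = [yi - shift * ai for yi, ai in zip(y, a)]  # projection onto C
    return u, e

def e_value(x, gamma, h, q0):
    # E_{gamma,h}(x) = -(gamma + <h, x>) / (0.5 * ||x||^2 + Q0).
    return -(gamma + dot(h, x)) / (0.5 * dot(x, x) + q0)

u, e = solve_hyperplane([1.0, 1.0], 1.0, -3.0, [1.0, -2.0], 1.0)
print(dot([1.0, 1.0], u), e, e_value(u, -3.0, [1.0, -2.0], 1.0))
```

Because $\beta_3 \le 0$ by the Cauchy-Schwarz inequality, the discriminant is nonnegative and the bigger root is nonnegative.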

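Proposition 3.3. If $C = \{x \in V \mid \langle a, x \rangle \le b\}$ is a halfspace, then the
subproblem (2.5) is solved by $u = P_C(-e^{-1}h)$, where

\[ P_C(y) = y - \frac{(\langle a, y \rangle - b)_+}{\|a\|_2^2}\, a \tag{3.15} \]

and $e$ is determined as follows. Let $e_1$ be given by (3.11) with

\[ \beta_1 := Q_0, \quad \beta_2 := \gamma, \quad \beta_3 := -\tfrac{1}{2} \|h\|_2^2, \tag{3.16} \]

and let $e_2$ be given by (3.11) with $\beta_1$, $\beta_2$, and $\beta_3$ as in (3.14).
If $\langle a, h \rangle \ge -e_1 b$ and $\langle a, h \rangle \ge -e_2 b$, then $e = e_1$.
If $\langle a, h \rangle < -e_1 b$ and $\langle a, h \rangle < -e_2 b$, then $e = e_2$.
If $\langle a, h \rangle \ge -e_1 b$ and $\langle a, h \rangle < -e_2 b$, then $e = \max\{e_1, e_2\}$.

Proof. The projection operator onto $C$ is given by (3.15). This gives

\[ P_C(-e^{-1}h) = -e^{-1} \Big( h + \frac{(-\langle a, h \rangle - e b)_+}{\|a\|_2^2}\, a \Big). \tag{3.17} \]

If $\langle a, h \rangle \ge -e b$, we obtain

\[ P_C(-e^{-1}h) = -e^{-1}h, \]

leading to

\[ e\,Q(P_C(-e^{-1}h)) + \gamma + \langle h, P_C(-e^{-1}h) \rangle
   = \tfrac{1}{2} e^{-1} \|h\|_2^2 + Q_0 e + \gamma - e^{-1} \|h\|_2^2 = 0. \]

Multiplying by $e$ gives

\[ Q_0 e^2 + \gamma e - \tfrac{1}{2} \|h\|_2^2 = \beta_1 e^2 + \beta_2 e + \beta_3 = 0, \]

where $\beta_1 := Q_0$, $\beta_2 := \gamma$, and $\beta_3 := -\tfrac{1}{2}\|h\|_2^2$. This identity
leads to a solution of the form (3.11), say $e_1$. If $\langle a, h \rangle < -e b$, (3.13) is valid
and $e$ is computed by (3.11), where $\beta_1$, $\beta_2$, and $\beta_3$ are defined in (3.14),
say $e_2$. After computing $e_1$ and $e_2$, we check whether the inequalities
$\langle a, h \rangle \ge -e_1 b$ and $\langle a, h \rangle < -e_2 b$ are satisfied. Since
the subproblem (2.5) has a solution, at least one of the conditions has to be satisfied. If
exactly one of them is satisfied, the corresponding $e$ and (3.17) give the solution. If both of
them hold, we take the solution with the bigger $e$. □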
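The case distinction for the halfspace can be sketched as follows. The Python code assumes the two candidates described above: $e_1$ from the coefficients $(Q_0, \gamma, -\tfrac12\|h\|_2^2)$ for an inactive constraint and $e_2$ from the hyperplane coefficients for an active constraint, with the consistency checks $\langle a,h\rangle \ge -e_1 b$ and $\langle a,h\rangle < -e_2 b$. The names are illustrative, and the final fallback branch is an assumption for the situation where only the second check holds.

```python
import math

def dot(p, q):
    return sum(pi * qi for pi, qi in zip(p, q))

def bigger_root(b1, b2, b3):
    return (-b2 + math.sqrt(b2 * b2 - 4.0 * b1 * b3)) / (2.0 * b1)

def solve_halfspace(a, b, gamma, h, q0):
    aa, ah, hh = dot(a, a), dot(a, h), dot(h, h)
    # Candidate 1: constraint inactive, u = -h/e1.
    e1 = bigger_root(q0, gamma, -0.5 * hh)
    # Candidate 2: constraint active, u lies on the hyperplane <a, x> = b.
    e2 = bigger_root(0.5 * b * b / aa + q0, b * ah / aa + gamma,
                     0.5 * ah * ah / aa - 0.5 * hh)
    ok1 = ah >= -e1 * b
    ok2 = ah < -e2 * b
    if ok1 and ok2:
        e = max(e1, e2)
    elif ok1:
        e = e1
    else:
        e = e2
    y = [-hi / e for hi in h]
    excess = max(dot(a, y) - b, 0.0)
    u = [yi - excess / aa * ai for yi, ai in zip(y, a)]  # projection onto C
    return u, e

def e_value(x, gamma, h, q0):
    return -(gamma + dot(h, x)) / (0.5 * dot(x, x) + q0)
```

With $a = (1, 0)$, $h = (-3, 4)$, $\gamma = -1$, and $Q_0 = 0.5$, the constraint is inactive for $b = 1$ and active for $b = 0.2$, so both branches are exercised by this data.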

Proposition 3.4. If $C = \{x \in \mathbb{R}^n \mid x_i \ge 0,\ i = 1, \dots, n\}$ is the nonnegative
orthant, then the subproblem (2.5) is solved by $u = P_C(-e^{-1}h)$, where

\[ P_C(y) = (y)_+ \tag{3.18} \]

and $e$ is given by (3.11) with

\[ \beta_1 := Q_0, \quad \beta_2 := \gamma, \quad
   \beta_3 := \tfrac{1}{2} \|(h)_-\|_2^2 - \langle h, (h)_- \rangle, \tag{3.19} \]

where $(h)_- := \min(h, 0)$ is taken componentwise.

Proof. The projection operator onto $C$ is given by (3.18), leading to

\[ P_C(-e^{-1}h) = -e^{-1} (h)_-. \]

This and (3.7) imply

\[ e\,Q(P_C(-e^{-1}h)) + \gamma + \langle h, P_C(-e^{-1}h) \rangle
   = \tfrac{1}{2} e^{-1} \|(h)_-\|_2^2 + Q_0 e + \gamma - e^{-1} \langle h, (h)_- \rangle = 0. \]

Multiplying by $e$ gives

\[ Q_0 e^2 + \gamma e + \tfrac{1}{2} \|(h)_-\|_2^2 - \langle h, (h)_- \rangle
   = \beta_1 e^2 + \beta_2 e + \beta_3 = 0, \]

where $\beta_1$, $\beta_2$, and $\beta_3$ are defined in (3.19), giving the result. □

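Proposition 3.5. Let $C = \{x \in \mathbb{R}^n \mid \|x\|_2 \le \xi\}$ be the Euclidean ball. Then

\[ P_C(y) = \begin{cases} \xi y/\|y\|_2 & \text{if } \|y\|_2 > \xi, \\ y & \text{if } \|y\|_2 \le \xi. \end{cases} \tag{3.20} \]

If $\|e^{-1}h\|_2 \le \xi$, where $e$ is given by (3.11) with

\[ \beta_1 := Q_0, \quad \beta_2 := \gamma, \quad \beta_3 := -\tfrac{1}{2} \|h\|_2^2, \tag{3.21} \]

then $u = -e^{-1}h$; otherwise, the solution of OSGA’s subproblem (2.5) is given by

\[ u = -\xi \frac{h}{\|h\|_2}, \quad e = -\frac{2(\gamma - \xi \|h\|_2)}{\xi^2 + 2Q_0}. \]

Proof. The projection operator onto $C$ is given by (3.20), leading to

\[ P_C(-e^{-1}h) = \begin{cases} -\xi h/\|h\|_2 & \text{if } \|h\|_2 > e\xi, \\ -e^{-1}h & \text{if } \|h\|_2 \le e\xi. \end{cases} \]

We first assume that $\|h\|_2 \le e\xi$, implying $P_C(-e^{-1}h) = -e^{-1}h$. Substituting this
into (3.7) yields

\[ e\,Q(P_C(-e^{-1}h)) + \gamma + \langle h, P_C(-e^{-1}h) \rangle
   = \tfrac{1}{2} e^{-1} \|h\|_2^2 + Q_0 e + \gamma - e^{-1} \|h\|_2^2 = 0, \]

and multiplying by $e$ gives

\[ Q_0 e^2 + \gamma e - \tfrac{1}{2} \|h\|_2^2 = \beta_1 e^2 + \beta_2 e + \beta_3 = 0, \]

where $\beta_1 := Q_0$, $\beta_2 := \gamma$, and $\beta_3 := -\tfrac{1}{2}\|h\|_2^2$. Hence $e$ is
given by (3.11). If this $e$ satisfies $\|h\|_2 \le e\xi$, then $u = -e^{-1}h$. Otherwise, we
assume that $\|h\|_2 > e\xi$. Substituting $P_C(-e^{-1}h) = -\xi h/\|h\|_2$ into (3.7) yields

\[ e \Big( \tfrac{1}{2} \xi^2 + Q_0 \Big) + \gamma - \xi \|h\|_2 = 0, \]

implying

\[ e = -\frac{2(\gamma - \xi \|h\|_2)}{\xi^2 + 2Q_0} \]

and $u = -\xi h/\|h\|_2$. This completes the proof. □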
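The two branches for the Euclidean ball translate into a short routine. The sketch below assumes $h \neq 0$ and takes the boundary value of $e$ directly from (3.7) with $u = -\xi h/\|h\|_2$, that is, $e = 2(\xi\|h\|_2 - \gamma)/(\xi^2 + 2Q_0)$; the function names are illustrative.

```python
import math

def solve_ball(xi, gamma, h, q0):
    # Closed-form solution of the subproblem over C = {x | ||x||_2 <= xi}.
    hh = sum(hi * hi for hi in h)
    norm_h = math.sqrt(hh)
    # Interior candidate: bigger root of Q0*e^2 + gamma*e - 0.5*||h||^2 = 0.
    e = (-gamma + math.sqrt(gamma * gamma + 2.0 * q0 * hh)) / (2.0 * q0)
    if norm_h <= e * xi:
        return [-hi / e for hi in h], e
    # Boundary case: u = -xi*h/||h||, and e follows from e*Q(u) = -gamma - <h, u>.
    e = 2.0 * (xi * norm_h - gamma) / (xi * xi + 2.0 * q0)
    return [-xi * hi / norm_h for hi in h], e

def e_value(x, gamma, h, q0):
    inner = sum(hi * xi_ for hi, xi_ in zip(h, x))
    return -(gamma + inner) / (0.5 * sum(v * v for v in x) + q0)
```

For $h = (3, 4)$, $\gamma = -1$, and $Q_0 = 0.5$, the radius $\xi = 1$ gives the interior solution, while $\xi = 0.5$ gives the boundary solution $u = (-0.3, -0.4)$ with $e = 5.6$.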

To solve bound-constrained problems with OSGA, we developed an algorithm
that can find the global solution of the subproblem (2.5) by solving a sequence of
one-dimensional rational optimization problems, see Algorithm 3 in [6]. Notice that the
constraint $C := \{x \in V \mid \|x\|_\infty \le \xi\}$ is a special case of a bound-constrained
problem with $\underline{x} = -\xi \mathbf{1}$ and $\overline{x} = \xi \mathbf{1}$, where
$\mathbf{1}$ is an $n$-dimensional vector with all elements equal to unity.
4. Solving structured problems with a functional constraint. In this
section we consider the structured convex constrained problem

\[ \min\ f(Ax) \quad \text{s.t.}\ \phi(x) \le \xi, \tag{4.1} \]

where $\phi : C \to \mathbb{R}$ is a simple smooth or nonsmooth, real-valued, and convex loss
function, and $\xi$ is a real constant. We call problem (4.1) a functional constraint
problem. While it is a special case of (3.1) with

\[ C := \{x \in V \mid \phi(x) \le \xi\}, \]

one can solve OSGA’s subproblem (2.5) directly by using the KKT optimality conditions,
especially when no efficient method for finding the projection onto $C$ is known.
Indeed, if a nonsmooth problem can be reformulated in the form (4.1) with a smooth
$f$ and a nonsmooth $\phi$, then OSGA can solve this nonsmooth problem with a
complexity of order $O(\varepsilon^{-1/2})$, which is optimal for smooth problems.

Example 4.1. (Linear inverse problem) Let $A : \mathbb{R}^n \to \mathbb{R}^m$ be an
ill-conditioned or singular linear operator and $y \in \mathbb{R}^m$ be a vector of observations.
The linear inverse problem is the quest of finding $x \in \mathbb{R}^n$ such that

\[ y = Ax + \nu, \tag{4.2} \]

with unknown but small additive noise $\nu \in \mathbb{R}^m$. The problem is solvable if one knows
additional qualitative information about $x$. This qualitative information is encoded in
a constraint on $x$, under which the Euclidean norm of $\nu$ is minimized. Constrained
optimization problems resulting from two typical qualitative constraints are

\[ \min\ \tfrac{1}{2} \|y - Ax\|_2^2 \quad \text{s.t.}\ \|x\|_2 \le \xi \tag{4.3} \]

and

\[ \min\ \tfrac{1}{2} \|y - Ax\|_2^2 \quad \text{s.t.}\ \|x\|_{1,2} \le \xi, \]

in which $\xi$ is a nonnegative real constant. This problem often occurs in applied sciences
and engineering, see [44, 64].

In the remainder of this section we assume that the functional constraint satisfies
the Cottle constraint qualification [10]:

(H1) For all $x \in C$, either $\phi(x) < 0$ or $0 \notin \partial\phi(x)$.

We also need the following result.

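Proposition 4.1. (see, e.g., [5]) Let $\phi : V \to \mathbb{R}$, $\phi(x) = \|x\|$. Then the
subdifferential of $\phi$ is

\[ \partial\phi(x) = \begin{cases}
   \{ g \in V^* \mid \|g\|_* \le 1 \} & \text{if } x = 0, \\
   \{ g \in V^* \mid \|g\|_* = 1,\ \langle g, x \rangle = \|x\| \} & \text{if } x \neq 0.
   \end{cases} \]

Moreover, if $\|\cdot\|$ is self-dual, then

\[ \partial\phi(x) = \begin{cases}
   \{ g \in V^* \mid \|g\|_* \le 1 \} & \text{if } x = 0, \\
   x/\|x\| & \text{if } x \neq 0.
   \end{cases} \]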
The next result gives the optimality conditions for OSGA’s subproblem under the
functional constraint of problem (4.1).

Theorem 4.2. Let (H1) be satisfied for the problem (4.1). Then, for a real constant $\xi$,
the solution $u$ of OSGA’s subproblem

\[ \min\ \frac{-\gamma - \langle h, x \rangle}{Q(x)} \quad \text{s.t.}\ \phi(x) \le \xi \]

satisfies either

\[ u = -e^{-1}h, \quad \mu = 0, \quad \phi(u) \le \xi \tag{4.5} \]

or

\[ \frac{e u + h}{\mu\, Q(u)} \in \partial\phi(u), \quad \mu > 0, \quad \phi(u) = \xi, \tag{4.6} \]

where $e := -(\gamma + \langle h, u \rangle)/Q(u)$.

Proof. We define the function

\[ E_{\gamma,h} : C \to \mathbb{R}, \quad E_{\gamma,h}(x) := -\frac{\gamma + \langle h, x \rangle}{Q(x)}. \]

Since this function is differentiable, by differentiating both sides of the equality
$E_{\gamma,h}(x) Q(x) = -\gamma - \langle h, x \rangle$ with respect to $x$, we obtain

\[ \partial E_{\gamma,h}(x) = \Big\{ \frac{-E_{\gamma,h}(x)\, x - h}{Q(x)} \Big\}. \tag{4.7} \]

In view of the KKT optimality conditions for inequality constrained nonsmooth problems,
see [10], we have the optimality conditions

\[ \begin{cases}
   0 \in \partial E_{\gamma,h}(u) + \mu\, \partial\phi(u), \\
   \phi(u) \le \xi, \\
   \mu \ge 0, \\
   \mu (\phi(u) - \xi) = 0
   \end{cases} \tag{4.8} \]

for this subproblem. Now, by substituting (4.7) into (4.8), setting
$e := -(\gamma + \langle h, u \rangle)/Q(u)$, and distinguishing between $\mu = 0$ and
$\mu > 0$, we obtain either (4.5) or (4.6). □

Theorem 4.2 gives the optimality conditions for a general function $\phi$; in
view of Theorem 3.3, it is especially useful when the projection onto
$C = \{x \mid \phi(x) \le \xi\}$ is not efficiently available. In the remainder of this section,
we derive the solution of OSGA’s subproblem (2.5) for some $\phi$ such as $\|\cdot\|_2$ and
$\|\cdot\|_{1,2}$ that appear in many applications. We already solved OSGA’s subproblem (2.5)
with the constraint $C = \{x \mid \|x\|_2 \le \xi\}$ in Proposition 3.5, but to show how to apply
Theorem 4.2 we study it again in the next result.

Proposition 4.3. Let $V$ be a real finite-dimensional Hilbert space with the
induced norm $\phi(\cdot) = \|\cdot\|_2$. Then OSGA’s subproblem (2.5) is solved by

\[ u = -e^{-1}h, \quad e = \frac{-\beta_2 + \sqrt{\beta_2^2 - 4\beta_1\beta_3}}{2\beta_1}, \quad \mu = 0, \]

where

\[ \beta_1 := Q_0, \quad \beta_2 := \gamma, \quad \beta_3 := -\tfrac{1}{2} \|h\|_2^2, \]

if $\phi(u) \le \xi$; otherwise it is solved by

\[ u = \frac{\xi}{\|h\|_2}\, h, \quad
   e = -\frac{2\|h\|_2 (\gamma \|h\|_2 + \xi \|h\|_2^2)}{\xi^2 \|h\|_2^2 + 2Q_0 \|h\|_2^2}, \quad
   \mu = \frac{2(\|h\|_2 + e\xi) \|h\|_2^2}{\xi^2 \|h\|_2^2 + 2Q_0 \|h\|_2^2}. \]

Proof. Since $\|\cdot\|_2$ is self-dual, Proposition 4.1 implies

\[ \partial\phi(u) = \begin{cases}
   \{ g \in V^* \mid \|g\|_2 \le 1 \} & \text{if } u = 0, \\
   u/\|u\|_2 & \text{if } u \neq 0.
   \end{cases} \]

As $u = 0$ is not useful in our optimization setting, we seek only $u \neq 0$. We now apply
Theorem 4.2, leading to two cases: (i) (4.5) holds; (ii) (4.6) holds.

Case (i). The condition (4.5) holds. Then we have $u = -e^{-1}h$. By substituting
this into the identity $E_{\gamma,h}(u) = e$, we get

\[ e = \frac{-\gamma + e^{-1} \|h\|_2^2}{\tfrac{1}{2} e^{-2} \|h\|_2^2 + Q_0}, \]

implying

\[ Q_0 e^2 + \gamma e - \tfrac{1}{2} \|h\|_2^2 = 0. \]

By using the bigger root of this equation, we have

\[ e = \frac{-\beta_2 + \sqrt{\beta_2^2 - 4\beta_1\beta_3}}{2\beta_1}, \]

where $\beta_1 = Q_0$, $\beta_2 = \gamma$, and $\beta_3 = -\tfrac{1}{2}\|h\|_2^2$.

Case (ii). The condition (4.6) holds. Then we have

\[ \frac{-e u - h}{\tfrac{1}{2} \|u\|_2^2 + Q_0} + \mu \frac{u}{\|u\|_2} = 0, \]

giving

\[ (-e u - h) \|u\|_2 + \mu \Big( \tfrac{1}{2} \|u\|_2^2 + Q_0 \Big) u = 0, \]

leading to

\[ \Big( -e \|u\|_2 + \tfrac{1}{2} \mu \|u\|_2^2 + \mu Q_0 \Big) u = \|u\|_2\, h. \tag{4.9} \]

This implies that there exists $\lambda$ such that $u = \lambda h$. By substituting this into
$\phi(u) = \|u\|_2 = \xi$ we get

\[ \lambda = \frac{\xi}{\|h\|_2}. \]

Now, substituting $u$ into (4.9), we obtain

\[ \mu = \frac{2(\|h\|_2 + e\xi) \|h\|_2^2}{\xi^2 \|h\|_2^2 + 2Q_0 \|h\|_2^2}. \tag{4.10} \]

It follows from $E_{\gamma,h}(u) = e$ that

\[ e = -\frac{2\|h\|_2 (\gamma \|h\|_2 + \xi \|h\|_2^2)}{\xi^2 \|h\|_2^2 + 2Q_0 \|h\|_2^2}. \]

This gives the result. □

In 2004, Yuan and Lin in [70] proposed an interesting regularizer called grouped
LASSO for linear regression. Later Kim et al. in [44] proposed a constrained
ridge regression model using the constraint

\[ \|x\|_{1,2} \le \xi, \]

where

\[ \|x\|_{1,2} := \sum_{i=1}^m \|x_{g_i}\|_2, \]

$x = (x_{g_1}, \dots, x_{g_m})$, and $\|\cdot\|_{1,2}$ is the so-called $\ell_{1,2}$ group norm.
We consider this constraint in the next result.

Proposition. 4.4. Let V be a real finite-dimensional vector space with the in¬ 
duced norm <fif) = || • Hi^- Then OSGA’s subproblem (2.5) is solved by 

u 9i = —e~ 1 h 9i for alii = 1, • ■ • ,m, 


and 


where 


e = 


—P 2 + VP 2 ~ ^0103 

-20i 


p = 0, 


1 ill 

Pi~Qo, 02~ 7, 02 ■= 2 IWI 2 -^2 \\h gi \\h 

i=1 


if (j)(u) < £; Otherwise it is solved by 


u \\ h 9ih - T (h? + Qo) c u . 1 

Ui = Pih gi , Pi = - _ ML n - for all 1 = !,■■■ , to, 


4 K 


9i 112 


and 


7 + (h,u) 2( 7 + Er =1 ^||Vlli) _ 2(£™ 1 ||h !?i || 2 + 0 

^ 1 ^ ^ n. on , no . 1 /-^ 


W + Qo EILi^IIMI + SQo’ ^ m(E I = 1 Tf||V||| + 2Qo)- 


Proof. Similar to Proposition 4.3, we consider it/0. In view of Proposition (4.1), 
we get 


d<Ku gi ) = 


U 9i 

IKJ 2 


for all i = 1 , • • • , to , 


dcj){u) 


* 9 1 


l 9l 11 2 



leading to 
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We now apply Theorem 4.2 leading to two cases: (i) (4.5) holds; (ii) (4.6) holds. 

Case (i). The condition (4.5) holds. Then we have u gi = —e~ 1 h gi for i = 1, • • • ,n. 
By substituting u = (u 9l , ■ ■ • ,u 9ri ) into the identity E^^iu) = e, we get 

, -7 + E^rllVllle- 1 

lll^ll le-i+Qo ’ 


implying 


Qoe 2 +je + -\\h\\l - E \\h 9i \ \\ = 0. 


By using the bigger root of this equation, we get 

—02 + V02 - 4/3i/?3 


e = 


-201 


where /3i := Q 0 , 0 2 := J, and 0 3 := \\\h\\l - EHi II^Ill- 
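Case (ii). The condition (4.6) holds. Then we have

\[ \frac{-e u_{g_i} - h_{g_i}}{\frac{1}{2}\|u\|_2^2 + Q_0} = \mu \frac{u_{g_i}}{\|u_{g_i}\|_2} \quad \text{for all } i = 1, \dots, m. \]

Since $\phi(u) = \|u\|_{1,2} = \xi$, we equivalently get

\[ \left( e + \mu\, \frac{\frac{1}{2}\|u\|_2^2 + Q_0}{\|u_{g_i}\|_2} \right) u_{g_i} = -h_{g_i}, \]

implying $u_{g_i} = \tau_i h_{g_i}$. If $h_{g_i} = 0$, then $u_{g_i} = 0$. Now let $h_{g_i} \neq 0$. Substituting
$u_{g_i} = \tau_i h_{g_i}$ into the previous identity, it follows that

\[ \left( e + \mu\, \frac{\frac{1}{2}\sum_{j=1}^{m} \tau_j^2 \|h_{g_j}\|_2^2 + Q_0}{\|\tau_i h_{g_i}\|_2} \right) \tau_i h_{g_i} = -h_{g_i}, \]

giving

\[ -e\tau_i \|h_{g_i}\|_2 + \mu \left( \frac{1}{2}\sum_{j=1}^{m} \tau_j^2 \|h_{g_j}\|_2^2 + Q_0 \right) = \|h_{g_i}\|_2 \quad \text{for all } i = 1, \dots, m. \]

Summing both sides over $i$, together with $\sum_{i=1}^{m} \tau_i \|h_{g_i}\|_2 = \xi$, yields

\[ -e\xi + m\mu \left( \frac{1}{2}\sum_{i=1}^{m} \tau_i^2 \|h_{g_i}\|_2^2 + Q_0 \right) = \sum_{i=1}^{m} \|h_{g_i}\|_2, \tag{4.11} \]

implying

\[ \mu = \frac{2\left(\sum_{i=1}^{m} \|h_{g_i}\|_2 + e\xi\right)}{m\left(\sum_{i=1}^{m} \tau_i^2 \|h_{g_i}\|_2^2 + 2Q_0\right)}. \]

By substituting this into the identity preceding (4.11), we have

\[ \tau_i = -\frac{m\|h_{g_i}\|_2 - \sum_{j=1}^{m} \|h_{g_j}\|_2 - e\xi}{m e \|h_{g_i}\|_2}, \]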



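leading to

\[ u = (\tau_1 h_{g_1}, \dots, \tau_m h_{g_m}). \]

By substituting this into $E_{\gamma,h}(u) = e$, we get

\[ e = -\frac{\gamma + \langle h, u\rangle}{\frac{1}{2}\|u\|_2^2 + Q_0} = -\frac{2\left(\gamma + \sum_{i=1}^{m} \tau_i \|h_{g_i}\|_2^2\right)}{\sum_{i=1}^{m} \tau_i^2 \|h_{g_i}\|_2^2 + 2Q_0}, \]

giving the result. □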
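Case (i) of the proof can be checked numerically. The following Python sketch is only an illustration under stated assumptions: the prox-function is taken as $Q(x) = Q_0 + \frac{1}{2}\|x\|_2^2$ (as the formulas in the proof suggest), the groups partition the index set so that $\sum_i \|h_{g_i}\|_2^2 = \|h\|_2^2$, and $h \neq 0$, $Q_0 > 0$. The function names are hypothetical and do not come from the OSGA package.

```python
import math


def osga_case_i(gamma, h, q0):
    """Closed-form candidate of case (i): u = -h / e.

    e is the larger root of q0*e^2 + gamma*e + beta3 = 0, where
    beta3 = 0.5*||h||^2 - sum_i ||h_{g_i}||^2 = -0.5*||h||^2
    because the groups are assumed to partition the indices.
    """
    h_sq = sum(v * v for v in h)
    beta1, beta2, beta3 = q0, gamma, 0.5 * h_sq - h_sq
    e = (-beta2 + math.sqrt(beta2 ** 2 - 4.0 * beta1 * beta3)) / (2.0 * beta1)
    u = [-v / e for v in h]
    return e, u


def e_value(gamma, h, q0, x):
    """E_{gamma,h}(x) = -(gamma + <h, x>) / Q(x) with the assumed Q."""
    inner = sum(a * b for a, b in zip(h, x))
    return -(gamma + inner) / (q0 + 0.5 * sum(v * v for v in x))


gamma, h, q0 = 1.0, [3.0, 4.0], 0.5
e, u = osga_case_i(gamma, h, q0)
# The identity E_{gamma,h}(u) = e should hold up to rounding.
print(e, e_value(gamma, h, q0, u))
```

The candidate is accepted only if it also satisfies $\phi(u) \le \xi$; otherwise case (ii) applies.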


5. Numerical experiments. A software package for solving unconstrained and 
simply constrained convex optimization problems with OSGA is publicly available at 
http://homepage.univie.ac.at/masoud.ahookhosh/. 

The package is written in MATLAB; it uses the parameters

\[ \delta = 0.9, \quad \alpha_{\max} = 0.7, \quad \kappa = \kappa' = 0.5, \quad \Psi_{\mathrm{target}} = -\infty, \]

and the prox-function (3.5) with $Q_0 = \frac{1}{2}\|x_0\|_2 + \epsilon$, where $\epsilon$ is the machine precision.
A user manual [2] describes the design and use of the package. Some examples are 
included as illustrations. 

This section discusses numerical results and comparisons of OSGA with some 
state-of-the-art first-order solvers on some ridge regression and image deblurring prob¬ 
lems. All numerical results were created with version 1.1 of the above software. The 
algorithms used for comparison use the default parameter values reported in the corre¬ 
sponding papers or packages. All numerical experiments were executed on a Toshiba 
Satellite Pro L750-176 laptop with Intel Core i7-2670QM processor and 8 GB RAM. 

5.1. Ridge regression. In this subsection we consider an $\ell_2$-constrained least
squares problem of the form (4.3) (so-called ridge regression, see [47]) and report
some numerical results.

The problem is generated by 

[A, z, x] = i_laplace(n), y = z + 0.1 * rand, 


where n = 5000 is the problem dimension and i_laplace.m is an ill-posed test problem
generator using the inverse Laplace transformation from the Regularization Tools
MATLAB package, which is available at

http://www.imm.dtu.dk/~pcha/Regutools/.

Since (4.3) is smooth and the projection onto $C = \{x \in \mathbb{R}^n \mid \|x\|_2 \le \xi\}$ is available
(see Table 1), we employ the gradient projection algorithm (PGA), the spectral gradient
projection [23] with the Grippo et al. nonmonotone term [37] (SPG-G), the spectral
gradient projection with the Amini et al. nonmonotone term [8] (SPG-A), and OSGA
(see Proposition 4.3) to solve this minimization problem. The parameters of SPG-G
and SPG-A are the same as those reported in the associated papers, but SPG-A uses

\[ \eta_k = \begin{cases} \eta_0/2 & \text{if } k = 1, \\ (\eta_{k-1} + \eta_{k-2})/2 & \text{if } k \ge 2. \end{cases} \]

The algorithms are stopped after 500 iterations.
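To make the role of the projection concrete, here is a minimal Python sketch of a projected gradient iteration for a ball-constrained least squares problem. It assumes that (4.3) has the form $\min \frac{1}{2}\|Ax - y\|_2^2$ subject to $\|x\|_2 \le \xi$ and uses a fixed step size; the actual PGA implementation used in the experiments may differ, and the function names are illustrative only.

```python
import math


def project_l2_ball(x, xi):
    """Orthogonal projection of x onto {x : ||x||_2 <= xi}."""
    norm = math.sqrt(sum(v * v for v in x))
    if norm <= xi:
        return list(x)
    return [xi * v / norm for v in x]


def projected_gradient(A, y, xi, step, iters=500):
    """Fixed-step projected gradient for 0.5*||Ax - y||^2 on the ball.

    A is a list of rows; the objective form and the constant step
    size are assumptions made for this sketch.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iters):
        # Residual r = A x - y and gradient g = A^T r.
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        x = project_l2_ball([x[j] - step * g[j] for j in range(n)], xi)
    return x


A = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 4.0]
x = projected_gradient(A, y, xi=1.0, step=1.0)
print(x)  # close to y scaled onto the unit ball, i.e. about [0.6, 0.8]
```

With the identity matrix the constrained minimizer is simply the projection of $y$ onto the ball, which makes the result easy to verify by hand.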

In Table 3 we consider $\xi = 10, 15, 20, 25$ and report the best attained function
values and the running time. The results imply that OSGA attains the best running
time and, except for $\xi = 15$, gives the best function values.

Table 5.1: Result summary for the ridge regression

  $\xi$   Quantity    PGA          SPG-G      SPG-A      OSGA
  10      f_b         101.70e-3    7.60e-3    6.41e-3    3.60e-3
          Time (s)    77.78        30.08      31.20      22.09
  15      f_b         48.23e-3     1.70e-3    1.31e-3    1.52e-3
          Time (s)    66.54        25.00      24.24      21.55
  20      f_b         23.08e-2     2.01e-2    1.74e-2    8.60e-3
          Time (s)    64.60        28.47      27.11      21.40
  25      f_b         23.00e-2     2.22e-2    1.24e-2    8.96e-3
          Time (s)    62.55        30.20      31.18      26.50

To see the results in more detail, we show the relative error of function values

\[ \delta_k := \frac{f_k - \hat f}{f_0 - \hat f} \tag{5.1} \]

in Figure 1, where $\hat f$ denotes the minimum and $f_0$ is the function value at the
initial point $x_0$.
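The relative error used in the convergence plots can be computed from a recorded history of function values. The Python sketch below assumes the definition $\delta_k = (f_k - \hat f)/(f_0 - \hat f)$, with $\hat f$ supplied by the user (for example, the best value found by any solver); the function name and the sample history are hypothetical.

```python
def relative_errors(f_values, f_min):
    """Return delta_k = (f_k - f_min) / (f_0 - f_min) for each k.

    f_values[0] is the function value at the initial point, and
    f_min must be strictly smaller than f_values[0].
    """
    f0 = f_values[0]
    return [(fk - f_min) / (f0 - f_min) for fk in f_values]


history = [10.0, 4.0, 2.5, 2.0]  # made-up function values
print(relative_errors(history, 2.0))  # [1.0, 0.25, 0.0625, 0.0]
```

By construction $\delta_0 = 1$, and $\delta_k$ decreases to 0 as $f_k$ approaches $\hat f$, which is why the quantity is convenient on a logarithmic axis.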

5.2. Image deblurring with nonnegativity constraint. As discussed in Section 3,
inverse problems appear in many fields of applied sciences and engineering. This
happens in particular when researchers use digital images to record and analyze
experimental results in fields such as astronomy, medical sciences, biology,
geophysics, and physics. In these cases, blurred and noisy images are commonly
observed because of environmental effects and imperfections in the imaging system.

In many applications, the variable x describes physical quantities, which are
meaningful only if each component of x is nonnegative. This constraint is
referred to as the nonnegativity constraint; it is especially useful for restoring
blurred and noisy images, see [11, 42, 43, 66].

We restore the 256 × 256 blurred and noisy MR-brain image using the model (3.2)
equipped with the isotropic total variation regularizer. The true image is available at
http://graphics.stanford.edu/data/voldata/.

The blurred/noisy image y is generated by a 9 × 9 uniform blur and by adding Gaussian
noise with zero mean and standard deviation $10^{-3}$. For restoring the image, we
use OSGA (see Proposition 3.4), MFISTA (a monotone version of FISTA proposed by
Beck & Teboulle in [17]), ADMM (an alternating direction method proposed by
Chan et al. in [30]), and PSGA (a projected subgradient scheme with nonsummable
diminishing step size, see [27]). The original codes of MFISTA and ADMM provided
by the authors are used. Since the methods are sensitive to the regularization
parameter $\lambda$, three different regularization parameters are used. The algorithms are
stopped after 100 iterations. The quality of the recovered image is compared
via the so-called peak signal-to-noise ratio (PSNR) defined by

\[ \mathrm{PSNR} = 20 \log_{10} \left( \frac{\sqrt{mn}}{\|x - x_t\|_F} \right) \tag{5.2} \]

and the improvement in signal-to-noise ratio (ISNR) defined by

\[ \mathrm{ISNR} = 20 \log_{10} \left( \frac{\|y - x_t\|_F}{\|x - x_t\|_F} \right), \tag{5.3} \]

where $\|\cdot\|_F$ is the Frobenius norm, $x_t$ denotes the $m \times n$ true image, y is the observed
image, and pixel values are in [0,1]. The results are summarized
in Table 4 and Figures 2 and 3.

[Figure: four panels showing $\delta_k$ versus iterations for (a) $\xi = 10$, (b) $\xi = 15$, (c) $\xi = 20$, (d) $\xi = 25$.]

Fig. 5.1: A comparison among PGA, SPG-G, SPG-A, and OSGA for solving the
problem (4.3) based on the relative error of function values $\delta_k$ (5.1). The algorithms
were stopped after 500 iterations.
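The two quality measures are straightforward to evaluate. The following Python sketch assumes images stored as lists of rows with pixel values in [0,1] and uses the formulas $\mathrm{PSNR} = 20\log_{10}(\sqrt{mn}/\|x - x_t\|_F)$ and $\mathrm{ISNR} = 20\log_{10}(\|y - x_t\|_F/\|x - x_t\|_F)$; the function names and the tiny test images are illustrative only.

```python
import math


def frobenius_distance(a, b):
    """Frobenius norm of the difference of two equally sized images."""
    return math.sqrt(sum((p - q) ** 2
                         for row_a, row_b in zip(a, b)
                         for p, q in zip(row_a, row_b)))


def psnr(x, x_true):
    """Peak signal-to-noise ratio of a restored image x, in dB."""
    m, n = len(x_true), len(x_true[0])
    return 20.0 * math.log10(math.sqrt(m * n) / frobenius_distance(x, x_true))


def isnr(x, x_true, y):
    """Improvement in signal-to-noise ratio of x over the observation y."""
    return 20.0 * math.log10(frobenius_distance(y, x_true)
                             / frobenius_distance(x, x_true))


x_true = [[0.0, 0.0], [0.0, 0.0]]
y_obs = [[1.0, 1.0], [1.0, 1.0]]      # made-up degraded observation
x_rec = [[0.1, 0.1], [0.1, 0.1]]      # made-up restoration
print(psnr(x_rec, x_true), isnr(x_rec, x_true, y_obs))  # both close to 20 dB
```

Both measures are undefined when the restored image equals the true image exactly, since the denominator vanishes; a positive ISNR means that the restoration is closer to the true image than the observation is.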

In Table 4 we report PSNR, the best available approximation $f_b$ of the minimum,
and the running time in seconds for three different regularization parameters.
The results reported in Figure 2 regarding function values and ISNR show that the
algorithms considered are sensitive to the parameter $\lambda$; however, the best results
are obtained for $\lambda = 10^{-4}$. More specifically, the results about function values in
subfigures (a), (c), and (e) demonstrate that OSGA outperforms PSGA, which means it
performs much better than the lower complexity bound $O(\varepsilon^{-2})$; however, it does not
perform as well as MFISTA, which attains the complexity of the order $O(\varepsilon^{-1/2})$.
Subfigures (b), (d), and (f) show that OSGA is comparable with MFISTA and ADMM and
even better than them in the sense of ISNR. The images deblurred by the algorithms
considered are illustrated in Figure 3 for $\lambda = 10^{-4}$.

[Figure: six panels. (a), (c), (e): $\delta_k$ versus iterations for $\lambda = 5 \times 10^{-4}$, $1 \times 10^{-4}$, $5 \times 10^{-5}$; (b), (d), (f): ISNR versus iterations for the same values of $\lambda$.]

Fig. 5.2: A comparison among PSGA, MFISTA, ADMM, and OSGA for deblurring
the 256 × 256 MR-brain image with the 9 × 9 uniform blur and the Gaussian noise
with deviation $10^{-3}$. The algorithms were stopped after 100 iterations. Subfigures
(a), (c), and (e) display the relative error of function values $\delta_k$ (5.1) versus iterations,
and Subfigures (b), (d), and (f) display ISNR (5.3) versus iterations.

[Figure: six panels. (a) Original image; (b) Blurred/noisy image; (c) PSGA: f = 0.1174, PSNR = 33.24, T = 1.15; (d) MFISTA: f = 0.0653, PSNR = 34.45, T = 6.51; (e) ADMM: f = 0.0651, PSNR = 34.49, T = 1.06; (f) OSGA: f = 0.0669, PSNR = 34.46, T = 1.97.]

Fig. 5.3: Deblurring of the 256 × 256 MR-brain image with the 9 × 9 uniform blur
and the Gaussian noise with deviation $10^{-3}$ by PSGA, MFISTA, ADMM, and OSGA
with the regularization parameter $\lambda = 10^{-4}$. The algorithms were stopped after 100
iterations.

Table 5.2: Result summary for L22ITV

  $\lambda$            Quantity    PSGA      MFISTA    ADMM      OSGA
  5 × 10^{-4}    PSNR        32.59     32.67     32.66     32.73
                 f_b         0.3528    0.3079    0.3080    0.3149
                 Time (s)    1.14      7.61      1.11      1.82
  1 × 10^{-4}    PSNR        33.23     33.96     33.95     33.97
                 f_b         0.1184    0.0960    0.0958    0.0980
                 Time (s)    1.14      7.34      1.04      1.71
  5 × 10^{-5}    PSNR        33.24     34.45     34.49     34.46
                 f_b         0.1174    0.0653    0.0651    0.0669
                 Time (s)    1.15      6.51      1.06      1.67

We also consider the restoration of the 641 × 641 blurred/noisy Dione image using
(3.3). The true image is available at

http://photojournal.jpl.nasa.gov/Help/ImageGallery.html.

The blurred/noisy image is constructed with the 7 × 7 Gaussian kernel with standard
deviation 5 and salt-and-pepper impulsive noise with the level 50%. To recover the
image, we use DRPD-1, DRPD-2 (Douglas-Rachford primal-dual schemes proposed by
Boţ & Hendrich in [25]), ADMM, and OSGA. The algorithms are stopped after 100
iterations, and three different regularization parameters are considered. The results
are reported in Table 5 and Figures 4 and 5.
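For readers who want to reproduce a similar degradation, the following Python sketch adds salt-and-pepper noise to an image with pixel values in [0,1]. It assumes one common convention: each pixel is corrupted independently with probability equal to the noise level and is then set to 0 or 1 with equal probability. The noise generator actually used for the experiments is not specified here, so this convention and the function name are assumptions.

```python
import random


def add_salt_and_pepper(image, level, seed=0):
    """Return a copy of `image` with salt-and-pepper noise.

    Each pixel is replaced with probability `level`; a replaced pixel
    becomes 1.0 (salt) or 0.0 (pepper) with equal probability.
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    noisy = []
    for row in image:
        new_row = []
        for pixel in row:
            if rng.random() < level:
                new_row.append(1.0 if rng.random() < 0.5 else 0.0)
            else:
                new_row.append(pixel)
        noisy.append(new_row)
    return noisy


gray = [[0.5] * 100 for _ in range(100)]      # made-up flat test image
noisy = add_salt_and_pepper(gray, level=0.5)
changed = sum(p != 0.5 for row in noisy for p in row) / 10000.0
print(changed)  # fraction of corrupted pixels, roughly 0.5
```

At the 50% level roughly half of the pixels carry no information about the true image, which is why an $\ell_1$ data-fidelity term is a natural choice for this kind of noise.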

The results of Table 5 show that OSGA outperforms the others in the sense of
PSNR. Figure 4 indicates that OSGA attains the best function values for $\lambda = 10^{-1}$
and $\lambda = 5 \times 10^{-2}$; however, ADMM gets the best function value for $\lambda = 5 \times 10^{-1}$.
It also implies that OSGA is comparable with or even better than the others regarding
ISNR. The resulting images for $\lambda = 10^{-1}$ are illustrated in Figure 5, demonstrating
that the algorithms can restore the image with acceptable quality, while OSGA obtains
the best function value and PSNR.


Table 5.3: Results summary for L1ITV

  $\lambda$            Quantity    DRPD-1       DRPD-2       ADMM         OSGA
  5 × 10^{-1}    PSNR        37.43        36.66        37.42        37.50
                 f_b         1.0352e+5    1.0365e+5    1.0293e+5    1.0326e+5
                 Time        10.86        6.83         8.57         9.01
  1 × 10^{-1}    PSNR        38.70        38.11        38.35        38.73
                 f_b         1.0324e+5    1.0294e+5    1.0281e+5    1.0281e+5
                 Time        10.43        6.68         8.46         8.32
  5 × 10^{-2}    PSNR        37.09        36.77        30.06        37.06
                 f_b         1.0336e+5    1.0321e+5    1.0312e+5    1.0299e+5
                 Time        10.26        6.27         8.25         9.23
[Figure: six panels. (a), (c), (e): $\delta_k$ versus iterations for $\lambda = 5 \times 10^{-1}$, $1 \times 10^{-1}$, $5 \times 10^{-2}$; (b), (d), (f): ISNR versus iterations for the same values of $\lambda$.]

Fig. 5.4: A comparison among DRPD-1, DRPD-2, ADMM, and OSGA for deblurring
the 641 × 641 Dione image with various regularization parameters $\lambda$. The
blurred/noisy image was constructed by the 7 × 7 Gaussian kernel with standard
deviation 5 and salt-and-pepper impulsive noise with the level 50%. The algorithms
were stopped after 100 iterations. Subfigures (a), (c), and (e) display the relative
error of function values $\delta_k$ (5.1) versus iterations, and (b), (d), and (f) display
ISNR (5.3) versus iterations.
[Figure panels: (c) DRPD-1: f = 1.0324e+5, PSNR = 38.70, T = 10.43; (d) DRPD-2: f = 1.0294e+5, PSNR = 38.11, T = 6.68; (e) ADMM: f = 1.0281e+5, PSNR = 38.35, T = 8.46; (f) OSGA: f = 1.0281e+5, PSNR = 38.73, T = 8.32.]

Fig. 5.5: Deblurring of the 641 × 641 Dione image using DRPD-1, DRPD-2, ADMM,
and OSGA with the parameter $\lambda = 10^{-1}$. The algorithms were stopped after 100
iterations. The blurred/noisy image was constructed by the 7 × 7 Gaussian kernel
with standard deviation 5 and salt-and-pepper impulsive noise with the level 50%.
6. Conclusions. This paper addresses the optimal subgradient algorithm OSGA
for solving structured convex constrained optimization problems. More specifically,
the solution of OSGA’s subproblem is investigated in the presence of some
convex constraints. Two types of convex constraints are considered, namely, simple
convex domains, for which the orthogonal projection onto the domain is efficiently
available, and functional constraints, defined as the sublevel sets of simple convex
functions. In each case some interesting examples are discussed for which OSGA’s
subproblem can be solved efficiently. Numerical results and comparisons with some
state-of-the-art algorithms are reported, showing that OSGA is efficient and reliable
for solving convex optimization problems in applications.